Given two data items, we often need to calculate some measure or metric of how similar they are. For example, a clustering algorithm may use such a measure to decide which items belong together. For discrete-valued features this might simply be a count of how many features are identical. For continuous-valued features a distance measure may be used, such as Euclidean distance or Manhattan (city block) distance. Because a distance is small when items are alike, to be a {\em similarity} measure it would usually be inverted in some way (e.g. 1/distance), so that larger values mean more similar items.
Used on pages 133, 134, 176, 201, 212, 214, 284, 388, 444, 528
Also known as similarity metrics
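
The measures described in this entry can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the inversion used here is 1/(1 + distance) rather than the plain 1/distance mentioned above, an assumption made so that two identical items (distance 0) do not cause a division by zero.

```python
import math


def matching_similarity(a, b):
    """Similarity for discrete-valued features: count of identical features."""
    if len(a) != len(b):
        raise ValueError("items must have the same number of features")
    return sum(1 for x, y in zip(a, b) if x == y)


def euclidean_distance(a, b):
    """Straight-line distance between two continuous-valued items."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute per-feature differences (city block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_to_similarity(d):
    """Invert a distance into a similarity in the range (0, 1].

    Assumption: uses 1 / (1 + d) instead of 1 / d so that d == 0 is safe.
    """
    return 1.0 / (1.0 + d)


# Discrete features: two of the three values agree.
discrete_score = matching_similarity(("red", "small", "round"),
                                     ("red", "large", "round"))

# Continuous features: distance first, then inverted into a similarity.
d_euclid = euclidean_distance((0.0, 0.0), (3.0, 4.0))
d_manhattan = manhattan_distance((0.0, 0.0), (3.0, 4.0))
continuous_score = distance_to_similarity(d_euclid)
```

With this inversion identical items score exactly 1, and the score falls towards 0 as the distance grows; other inversions (for example a negative exponential of the distance) would serve equally well where a different scale is wanted.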